Overview

Dataset statistics

Number of variables: 20
Number of observations: 5008
Missing cells: 9462
Missing cells (%): 9.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.0 MiB
Average record size in memory: 1.0 KiB
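
The derived figures in this block can be cross-checked from the raw counts. The sketch below is only a sanity check on the reported numbers, not the profiler's own code; the variable names are illustrative, and it assumes that the missing-cell percentage is taken over variables × observations and that 1 MiB = 1024 KiB.

```python
# Values copied from the "Dataset statistics" block of this report.
n_variables = 20
n_observations = 5008
missing_cells = 9462
total_size_mib = 5.0

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 KiB.
avg_record_kib = total_size_mib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_kib:.1f} KiB")
```

Both derived values round to the figures the report shows (9.4% and 1.0 KiB).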

Variable types

DateTime: 1
Unsupported: 1
Categorical: 10
Numeric: 8

Alerts

Location has a high cardinality: 4124 distinct values [High cardinality]
Operator has a high cardinality: 2267 distinct values [High cardinality]
flight_no has a high cardinality: 892 distinct values [High cardinality]
Route has a high cardinality: 3838 distinct values [High cardinality]
Ac Type has a high cardinality: 2468 distinct values [High cardinality]
Registration has a high cardinality: 4700 distinct values [High cardinality]
cn_ln has a high cardinality: 3907 distinct values [High cardinality]
Summary has a high cardinality: 4857 distinct values [High cardinality]
Country has a high cardinality: 486 distinct values [High cardinality]
All Aboard is highly overall correlated with Passengers Board and 3 other fields [High correlation]
Passengers Board is highly overall correlated with All Aboard and 3 other fields [High correlation]
Crew Board is highly overall correlated with All Aboard and 3 other fields [High correlation]
All Fatalities is highly overall correlated with All Aboard and 4 other fields [High correlation]
Passenger Fatalities is highly overall correlated with All Aboard and 2 other fields [High correlation]
Crew fatalities is highly overall correlated with Crew Board and 1 other field [High correlation]
Time has 1504 (30.0%) missing values [Missing]
flight_no has 3682 (73.5%) missing values [Missing]
Route has 762 (15.2%) missing values [Missing]
Registration has 272 (5.4%) missing values [Missing]
cn_ln has 667 (13.3%) missing values [Missing]
Passengers Board has 221 (4.4%) missing values [Missing]
Crew Board has 219 (4.4%) missing values [Missing]
Passenger Fatalities has 235 (4.7%) missing values [Missing]
Crew fatalities has 235 (4.7%) missing values [Missing]
Summary has 59 (1.2%) missing values [Missing]
Hour has 1504 (30.0%) missing values [Missing]
Ground is highly skewed (γ1 = 48.98678629) [Skewed]
Location is uniformly distributed [Uniform]
Registration is uniformly distributed [Uniform]
cn_ln is uniformly distributed [Uniform]
Summary is uniformly distributed [Uniform]
Time is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Passengers Board has 869 (17.4%) zeros [Zeros]
All Fatalities has 76 (1.5%) zeros [Zeros]
Passenger Fatalities has 1040 (20.8%) zeros [Zeros]
Crew fatalities has 400 (8.0%) zeros [Zeros]
Ground has 4716 (94.2%) zeros [Zeros]
Hour has 74 (1.5%) zeros [Zeros]
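
The percentages in the Missing and Zeros alerts appear to be each count divided by the 5008 observations. A minimal sketch that recomputes a few of them, assuming that denominator (the helper `pct` and the dictionary names are illustrative, not part of the report):

```python
n_observations = 5008  # "Number of observations" from the dataset statistics

# Counts copied from a subset of the Missing alerts.
missing = {"Time": 1504, "flight_no": 3682, "Route": 762,
           "Registration": 272, "cn_ln": 667, "Summary": 59}
# Counts copied from a subset of the Zeros alerts.
zeros = {"Passengers Board": 869, "Ground": 4716, "Hour": 74}

def pct(count, total=n_observations):
    """Share of `total` as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)

missing_pct = {name: pct(count) for name, count in missing.items()}
zeros_pct = {name: pct(count) for name, count in zeros.items()}

for name, value in {**missing_pct, **zeros_pct}.items():
    print(f"{name}: {value}%")
```

The recomputed values agree with the alert text for every column listed here.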

Reproduction

Analysis started: 2023-05-22 15:51:13.313380
Analysis finished: 2023-05-22 15:51:24.783185
Duration: 11.47 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json
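
The reported duration follows from the two timestamps. A small check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-05-22 15:51:13.313380")
finished = datetime.fromisoformat("2023-05-22 15:51:24.783185")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```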

Variables

Date
Date

Distinct: 4577
Distinct (%): 91.4%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB
Minimum: 1908-09-17 00:00:00
Maximum: 2021-07-06 00:00:00
Histogram with fixed size bins (bins=50)

Time
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 1504
Missing (%): 30.0%
Memory size: 250.4 KiB

Location
Categorical

HIGH CARDINALITY  UNIFORM 

Distinct: 4124
Distinct (%): 82.4%
Missing: 5
Missing (%): 0.1%
Memory size: 420.3 KiB

Value | Count
Moscow, Russia | 16
Manila, Philippines | 15
New York, New York | 14
Sao Paulo, Brazil | 13
Cairo, Egypt | 13
Other values (4119) | 4932

Length

Max length: 72
Median length: 49
Mean length: 20.812712
Min length: 5

Characters and Unicode

Total characters: 104126
Distinct characters: 90
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
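
As an illustration of what such character properties look like, the sketch below counts the Unicode general category of each character in one of the Location sample values, using Python's standard `unicodedata` module. This is not necessarily how the report computes its tables; the function name `category_counts` is illustrative.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "Juvisy-sur-Orge, France"  # 2nd sample row of Location
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes correspond to the category names used in the report's tables: `Lu` is Uppercase Letter, `Ll` is Lowercase Letter, `Zs` is Space Separator, `Po` is Other Punctuation, and `Pd` is Dash Punctuation.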

Unique

Unique: 3687
Unique (%): 73.7%

Sample

1st row: Fort Myer, Virginia
2nd row: Juvisy-sur-Orge, France
3rd row: Atlantic City, New Jersey
4th row: Victoria, British Columbia, Canada
5th row: Over the North Sea

Common Values

Value | Count | Frequency (%)
Moscow, Russia | 16 | 0.3%
Manila, Philippines | 15 | 0.3%
New York, New York | 14 | 0.3%
Sao Paulo, Brazil | 13 | 0.3%
Cairo, Egypt | 13 | 0.3%
Bogota, Colombia | 12 | 0.2%
Rio de Janeiro, Brazil | 12 | 0.2%
Near Moscow, Russia | 11 | 0.2%
Chicago, Illinois | 11 | 0.2%
Tehran, Iran | 10 | 0.2%
Other values (4114) | 4876 | 97.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
near 1350
 
9.2%
off 350
 
2.4%
russia 255
 
1.7%
new 229
 
1.6%
brazil 176
 
1.2%
colombia 153
 
1.0%
canada 131
 
0.9%
france 127
 
0.9%
california 117
 
0.8%
mexico 113
 
0.8%
Other values (4153) 11652
79.5%

Most occurring characters

ValueCountFrequency (%)
a 13037
 
12.5%
9703
 
9.3%
e 7073
 
6.8%
i 6567
 
6.3%
n 6545
 
6.3%
r 6035
 
5.8%
o 5367
 
5.2%
, 5210
 
5.0%
l 4000
 
3.8%
s 3530
 
3.4%
Other values (80) 37059
35.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 74113
71.2%
Uppercase Letter 14738
 
14.2%
Space Separator 9704
 
9.3%
Other Punctuation 5357
 
5.1%
Dash Punctuation 105
 
0.1%
Decimal Number 66
 
0.1%
Control 21
 
< 0.1%
Close Punctuation 11
 
< 0.1%
Open Punctuation 11
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 13037
17.6%
e 7073
9.5%
i 6567
8.9%
n 6545
8.8%
r 6035
 
8.1%
o 5367
 
7.2%
l 4000
 
5.4%
s 3530
 
4.8%
t 3112
 
4.2%
u 2756
 
3.7%
Other values (31) 16091
21.7%
Uppercase Letter
ValueCountFrequency (%)
N 2032
13.8%
C 1456
 
9.9%
S 1145
 
7.8%
M 999
 
6.8%
B 952
 
6.5%
A 920
 
6.2%
P 787
 
5.3%
I 720
 
4.9%
R 652
 
4.4%
O 588
 
4.0%
Other values (17) 4487
30.4%
Decimal Number
ValueCountFrequency (%)
0 24
36.4%
1 15
22.7%
2 9
 
13.6%
5 8
 
12.1%
8 3
 
4.5%
7 2
 
3.0%
3 2
 
3.0%
9 2
 
3.0%
6 1
 
1.5%
Other Punctuation
ValueCountFrequency (%)
, 5210
97.3%
. 115
 
2.1%
' 24
 
0.4%
/ 6
 
0.1%
& 1
 
< 0.1%
: 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
9703
> 99.9%
  1
 
< 0.1%
Control
ValueCountFrequency (%)
16
76.2%
5
 
23.8%
Dash Punctuation
ValueCountFrequency (%)
- 105
100.0%
Close Punctuation
ValueCountFrequency (%)
) 11
100.0%
Open Punctuation
ValueCountFrequency (%)
( 11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 88851
85.3%
Common 15275
 
14.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 13037
14.7%
e 7073
 
8.0%
i 6567
 
7.4%
n 6545
 
7.4%
r 6035
 
6.8%
o 5367
 
6.0%
l 4000
 
4.5%
s 3530
 
4.0%
t 3112
 
3.5%
u 2756
 
3.1%
Other values (58) 30829
34.7%
Common
ValueCountFrequency (%)
9703
63.5%
, 5210
34.1%
. 115
 
0.8%
- 105
 
0.7%
0 24
 
0.2%
' 24
 
0.2%
16
 
0.1%
1 15
 
0.1%
) 11
 
0.1%
( 11
 
0.1%
Other values (12) 41
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 104084
> 99.9%
None 42
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 13037
 
12.5%
9703
 
9.3%
e 7073
 
6.8%
i 6567
 
6.3%
n 6545
 
6.3%
r 6035
 
5.8%
o 5367
 
5.2%
, 5210
 
5.0%
l 4000
 
3.8%
s 3530
 
3.4%
Other values (63) 37017
35.6%
None
ValueCountFrequency (%)
é 14
33.3%
ö 5
 
11.9%
í 4
 
9.5%
ó 4
 
9.5%
á 2
 
4.8%
ï 2
 
4.8%
ô 1
 
2.4%
è 1
 
2.4%
à 1
 
2.4%
ä 1
 
2.4%
Other values (7) 7
16.7%

Operator
Categorical

Distinct: 2267
Distinct (%): 45.4%
Missing: 10
Missing (%): 0.2%
Memory size: 412.5 KiB

Value | Count
Aeroflot | 253
Military - U.S. Air Force | 141
Air France | 74
Deutsche Lufthansa | 63
United Air Lines | 44
Other values (2262) | 4423

Length

Max length: 65
Median length: 47
Mean length: 18.957583
Min length: 3

Characters and Unicode

Total characters: 94750
Distinct characters: 87
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1734
Unique (%): 34.7%

Sample

1st row: Military - U.S. Army
2nd row: Military - U.S. Navy
3rd row: Private
4th row: Military - German Navy
5th row: Military - German Navy

Common Values

Value | Count | Frequency (%)
Aeroflot | 253 | 5.1%
Military - U.S. Air Force | 141 | 2.8%
Air France | 74 | 1.5%
Deutsche Lufthansa | 63 | 1.3%
United Air Lines | 44 | 0.9%
China National Aviation Corporation | 43 | 0.9%
Military - U.S. Army Air Forces | 43 | 0.9%
Pan American World Airways | 41 | 0.8%
American Airlines | 37 | 0.7%
US Aerial Mail Service | 35 | 0.7%
Other values (2257) | 4224 | 84.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
air 1481
 
10.3%
961
 
6.7%
airlines 840
 
5.8%
military 778
 
5.4%
force 557
 
3.9%
airways 453
 
3.1%
u.s 302
 
2.1%
aeroflot 265
 
1.8%
lines 184
 
1.3%
royal 152
 
1.1%
Other values (2079) 8422
58.5%

Most occurring characters

ValueCountFrequency (%)
i 10212
 
10.8%
9421
 
9.9%
r 8849
 
9.3%
a 7786
 
8.2%
e 6780
 
7.2%
n 5528
 
5.8%
A 5083
 
5.4%
o 4380
 
4.6%
l 4079
 
4.3%
s 4000
 
4.2%
Other values (77) 28632
30.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 68181
72.0%
Uppercase Letter 15071
 
15.9%
Space Separator 9422
 
9.9%
Dash Punctuation 939
 
1.0%
Other Punctuation 869
 
0.9%
Open Punctuation 115
 
0.1%
Close Punctuation 115
 
0.1%
Decimal Number 30
 
< 0.1%
Control 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 10212
15.0%
r 8849
13.0%
a 7786
11.4%
e 6780
9.9%
n 5528
8.1%
o 4380
6.4%
l 4079
 
6.0%
s 4000
 
5.9%
t 3921
 
5.8%
c 1996
 
2.9%
Other values (28) 10650
15.6%
Uppercase Letter
ValueCountFrequency (%)
A 5083
33.7%
M 1217
 
8.1%
S 1138
 
7.6%
C 910
 
6.0%
F 901
 
6.0%
T 679
 
4.5%
L 661
 
4.4%
U 534
 
3.5%
P 513
 
3.4%
N 496
 
3.3%
Other values (16) 2939
19.5%
Decimal Number
ValueCountFrequency (%)
0 5
16.7%
7 4
13.3%
4 4
13.3%
2 3
10.0%
5 3
10.0%
1 3
10.0%
8 2
 
6.7%
6 2
 
6.7%
9 2
 
6.7%
3 2
 
6.7%
Other Punctuation
ValueCountFrequency (%)
. 718
82.6%
/ 109
 
12.5%
' 25
 
2.9%
, 10
 
1.2%
& 6
 
0.7%
? 1
 
0.1%
Space Separator
ValueCountFrequency (%)
9421
> 99.9%
  1
 
< 0.1%
Control
ValueCountFrequency (%)
6
75.0%
2
 
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 939
100.0%
Open Punctuation
ValueCountFrequency (%)
( 115
100.0%
Close Punctuation
ValueCountFrequency (%)
) 115
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 83252
87.9%
Common 11498
 
12.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 10212
12.3%
r 8849
 
10.6%
a 7786
 
9.4%
e 6780
 
8.1%
n 5528
 
6.6%
A 5083
 
6.1%
o 4380
 
5.3%
l 4079
 
4.9%
s 4000
 
4.8%
t 3921
 
4.7%
Other values (54) 22634
27.2%
Common
ValueCountFrequency (%)
9421
81.9%
- 939
 
8.2%
. 718
 
6.2%
( 115
 
1.0%
) 115
 
1.0%
/ 109
 
0.9%
' 25
 
0.2%
, 10
 
0.1%
6
 
0.1%
& 6
 
0.1%
Other values (13) 34
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 94627
99.9%
None 123
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 10212
 
10.8%
9421
 
10.0%
r 8849
 
9.4%
a 7786
 
8.2%
e 6780
 
7.2%
n 5528
 
5.8%
A 5083
 
5.4%
o 4380
 
4.6%
l 4079
 
4.3%
s 4000
 
4.2%
Other values (64) 28509
30.1%
None
ValueCountFrequency (%)
é 102
82.9%
á 5
 
4.1%
à 2
 
1.6%
í 2
 
1.6%
ó 2
 
1.6%
ç 2
 
1.6%
ï 2
 
1.6%
ã 1
 
0.8%
ú 1
 
0.8%
ê 1
 
0.8%
Other values (3) 3
 
2.4%

flight_no
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 892
Distinct (%): 67.3%
Missing: 3682
Missing (%): 73.5%
Memory size: 232.2 KiB

Value | Count
- | 36
1 | 11
101 | 9
6 | 7
4 | 7
Other values (887) | 1256

Length

Max length: 12
Median length: 3
Mean length: 3.2390649
Min length: 1

Characters and Unicode

Total characters: 4295
Distinct characters: 47
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 656
Unique (%): 49.5%

Sample

1st row: F-AIKG
2nd row: 7
3rd row: 599
4th row: 6
5th row: 6

Common Values

Value | Count | Frequency (%)
- | 36 | 0.7%
1 | 11 | 0.2%
101 | 9 | 0.2%
6 | 7 | 0.1%
4 | 7 | 0.1%
901 | 7 | 0.1%
115 | 6 | 0.1%
301 | 6 | 0.1%
201 | 6 | 0.1%
703 | 6 | 0.1%
Other values (882) | 1225 | 24.5%
(Missing) | 3682 | 73.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
46
 
3.4%
1 11
 
0.8%
101 10
 
0.7%
6 8
 
0.6%
4 7
 
0.5%
901 7
 
0.5%
115 6
 
0.4%
301 6
 
0.4%
201 6
 
0.4%
703 6
 
0.4%
Other values (883) 1235
91.6%

Most occurring characters

ValueCountFrequency (%)
1 638
14.9%
0 497
11.6%
2 495
11.5%
3 417
9.7%
5 385
9.0%
4 347
8.1%
6 330
7.7%
7 316
7.4%
8 291
6.8%
9 270
6.3%
Other values (37) 309
7.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3986
92.8%
Uppercase Letter 156
 
3.6%
Dash Punctuation 87
 
2.0%
Other Punctuation 33
 
0.8%
Space Separator 22
 
0.5%
Lowercase Letter 11
 
0.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 21
13.5%
S 14
 
9.0%
H 13
 
8.3%
P 11
 
7.1%
C 10
 
6.4%
F 10
 
6.4%
U 8
 
5.1%
R 7
 
4.5%
I 7
 
4.5%
L 7
 
4.5%
Other values (15) 48
30.8%
Decimal Number
ValueCountFrequency (%)
1 638
16.0%
0 497
12.5%
2 495
12.4%
3 417
10.5%
5 385
9.7%
4 347
8.7%
6 330
8.3%
7 316
7.9%
8 291
7.3%
9 270
6.8%
Lowercase Letter
ValueCountFrequency (%)
r 2
18.2%
a 2
18.2%
n 2
18.2%
y 1
9.1%
h 1
9.1%
t 1
9.1%
e 1
9.1%
o 1
9.1%
Other Punctuation
ValueCountFrequency (%)
/ 32
97.0%
? 1
 
3.0%
Dash Punctuation
ValueCountFrequency (%)
- 87
100.0%
Space Separator
ValueCountFrequency (%)
22
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4128
96.1%
Latin 167
 
3.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 21
 
12.6%
S 14
 
8.4%
H 13
 
7.8%
P 11
 
6.6%
C 10
 
6.0%
F 10
 
6.0%
U 8
 
4.8%
R 7
 
4.2%
I 7
 
4.2%
L 7
 
4.2%
Other values (23) 59
35.3%
Common
ValueCountFrequency (%)
1 638
15.5%
0 497
12.0%
2 495
12.0%
3 417
10.1%
5 385
9.3%
4 347
8.4%
6 330
8.0%
7 316
7.7%
8 291
7.0%
9 270
6.5%
Other values (4) 142
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4295
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 638
14.9%
0 497
11.6%
2 495
11.5%
3 417
9.7%
5 385
9.0%
4 347
8.1%
6 330
7.7%
7 316
7.4%
8 291
6.8%
9 270
6.3%
Other values (37) 309
7.2%

Route
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 3838
Distinct (%): 90.4%
Missing: 762
Missing (%): 15.2%
Memory size: 394.0 KiB

Value | Count
Training | 96
Sightseeing | 31
Test flight | 23
Sao Paulo - Rio de Janeiro | 7
Test | 6
Other values (3833) | 4083

Length

Max length: 59
Median length: 51
Mean length: 22.166039
Min length: 4

Characters and Unicode

Total characters: 94117
Distinct characters: 92
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3631
Unique (%): 85.5%

Sample

1st row: Demonstration
2nd row: Air show
3rd row: Test flight
4th row: Shuttle
5th row: Venice Taliedo

Common Values

Value | Count | Frequency (%)
Training | 96 | 1.9%
Sightseeing | 31 | 0.6%
Test flight | 23 | 0.5%
Sao Paulo - Rio de Janeiro | 7 | 0.1%
Test | 6 | 0.1%
Rio de Janeiro - Sao Paulo | 5 | 0.1%
Huambo - Luanda | 4 | 0.1%
Paris - London | 4 | 0.1%
Barranquilla - Bogota | 4 | 0.1%
Croydon - Paris | 4 | 0.1%
Other values (3828) | 4062 | 81.1%
(Missing) | 762 | 15.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
4633
27.5%
city 213
 
1.3%
new 149
 
0.9%
san 140
 
0.8%
york 117
 
0.7%
paris 116
 
0.7%
training 103
 
0.6%
de 101
 
0.6%
london 88
 
0.5%
moscow 84
 
0.5%
Other values (3628) 11083
65.9%

Most occurring characters

ValueCountFrequency (%)
12646
 
13.4%
a 9833
 
10.4%
n 5569
 
5.9%
o 5504
 
5.8%
i 5244
 
5.6%
e 5108
 
5.4%
- 4927
 
5.2%
r 4487
 
4.8%
l 3420
 
3.6%
s 3075
 
3.3%
Other values (82) 34304
36.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 62477
66.4%
Uppercase Letter 12942
 
13.8%
Space Separator 12647
 
13.4%
Dash Punctuation 4931
 
5.2%
Other Punctuation 1065
 
1.1%
Control 30
 
< 0.1%
Decimal Number 16
 
< 0.1%
Final Punctuation 4
 
< 0.1%
Open Punctuation 3
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 9833
15.7%
n 5569
 
8.9%
o 5504
 
8.8%
i 5244
 
8.4%
e 5108
 
8.2%
r 4487
 
7.2%
l 3420
 
5.5%
s 3075
 
4.9%
t 3006
 
4.8%
u 2566
 
4.1%
Other values (30) 14665
23.5%
Uppercase Letter
ValueCountFrequency (%)
C 1232
 
9.5%
B 1140
 
8.8%
S 1081
 
8.4%
A 1046
 
8.1%
M 1042
 
8.1%
P 823
 
6.4%
L 788
 
6.1%
T 710
 
5.5%
K 640
 
4.9%
N 631
 
4.9%
Other values (18) 3809
29.4%
Decimal Number
ValueCountFrequency (%)
9 3
18.8%
4 3
18.8%
1 3
18.8%
7 2
12.5%
2 2
12.5%
8 1
 
6.2%
6 1
 
6.2%
0 1
 
6.2%
Other Punctuation
ValueCountFrequency (%)
, 915
85.9%
. 98
 
9.2%
/ 20
 
1.9%
' 20
 
1.9%
? 6
 
0.6%
: 5
 
0.5%
\ 1
 
0.1%
Space Separator
ValueCountFrequency (%)
12646
> 99.9%
  1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 4927
99.9%
– 4
 
0.1%
Control
ValueCountFrequency (%)
29
96.7%
1
 
3.3%
Final Punctuation
ValueCountFrequency (%)
’ 4
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 75419
80.1%
Common 18698
 
19.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 9833
 
13.0%
n 5569
 
7.4%
o 5504
 
7.3%
i 5244
 
7.0%
e 5108
 
6.8%
r 4487
 
5.9%
l 3420
 
4.5%
s 3075
 
4.1%
t 3006
 
4.0%
u 2566
 
3.4%
Other values (58) 27607
36.6%
Common
ValueCountFrequency (%)
12646
67.6%
- 4927
 
26.4%
, 915
 
4.9%
. 98
 
0.5%
29
 
0.2%
/ 20
 
0.1%
' 20
 
0.1%
? 6
 
< 0.1%
: 5
 
< 0.1%
’ 4
 
< 0.1%
Other values (14) 28
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 93988
99.9%
None 121
 
0.1%
Punctuation 8
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
12646
 
13.5%
a 9833
 
10.5%
n 5569
 
5.9%
o 5504
 
5.9%
i 5244
 
5.6%
e 5108
 
5.4%
- 4927
 
5.2%
r 4487
 
4.8%
l 3420
 
3.6%
s 3075
 
3.3%
Other values (63) 34175
36.4%
None
ValueCountFrequency (%)
é 38
31.4%
í 21
17.4%
á 15
 
12.4%
ó 14
 
11.6%
ã 6
 
5.0%
ü 6
 
5.0%
ç 4
 
3.3%
è 4
 
3.3%
Î 3
 
2.5%
ö 2
 
1.7%
Other values (7) 8
 
6.6%
Punctuation
ValueCountFrequency (%)
’ 4
50.0%
– 4
50.0%

Ac Type
Categorical

Distinct: 2468
Distinct (%): 49.4%
Missing: 13
Missing (%): 0.3%
Memory size: 408.4 KiB

Value | Count
Douglas DC-3 | 333
de Havilland Canada DHC-6 Twin Otter 300 | 81
Douglas C-47A | 70
Douglas C-47 | 64
Douglas DC-4 | 41
Other values (2463) | 4406

Length

Max length: 42
Median length: 36
Mean length: 18.541542
Min length: 4

Characters and Unicode

Total characters: 92615
Distinct characters: 77
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1863
Unique (%): 37.3%

Sample

1st row: Wright Flyer III
2nd row: Wright Byplane
3rd row: Dirigible
4th row: Curtiss seaplane
5th row: Zeppelin L-1 (airship)

Common Values

Value | Count | Frequency (%)
Douglas DC-3 | 333 | 6.6%
de Havilland Canada DHC-6 Twin Otter 300 | 81 | 1.6%
Douglas C-47A | 70 | 1.4%
Douglas C-47 | 64 | 1.3%
Douglas DC-4 | 41 | 0.8%
Antonov AN-26 | 35 | 0.7%
Yakovlev YAK-40 | 35 | 0.7%
Junkers JU-52/3m | 30 | 0.6%
De Havilland DH-4 | 27 | 0.5%
Douglas C-47B | 27 | 0.5%
Other values (2458) | 4252 | 84.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
douglas 1130
 
8.3%
boeing 418
 
3.1%
dc-3 387
 
2.8%
lockheed 332
 
2.4%
de 294
 
2.2%
havilland 292
 
2.1%
antonov 288
 
2.1%
canada 159
 
1.2%
otter 146
 
1.1%
ilyushin 142
 
1.0%
Other values (2525) 10025
73.6%

Most occurring characters

ValueCountFrequency (%)
8649
 
9.3%
- 5180
 
5.6%
e 4842
 
5.2%
o 4638
 
5.0%
a 4636
 
5.0%
n 3856
 
4.2%
l 3696
 
4.0%
i 3486
 
3.8%
r 3306
 
3.6%
C 3034
 
3.3%
Other values (67) 47292
51.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 46427
50.1%
Uppercase Letter 17900
 
19.3%
Decimal Number 13808
 
14.9%
Space Separator 8650
 
9.3%
Dash Punctuation 5180
 
5.6%
Other Punctuation 264
 
0.3%
Open Punctuation 190
 
0.2%
Close Punctuation 189
 
0.2%
Math Symbol 3
 
< 0.1%
Control 2
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 4842
10.4%
o 4638
10.0%
a 4636
10.0%
n 3856
 
8.3%
l 3696
 
8.0%
i 3486
 
7.5%
r 3306
 
7.1%
s 2917
 
6.3%
t 2357
 
5.1%
u 2217
 
4.8%
Other values (18) 10476
22.6%
Uppercase Letter
ValueCountFrequency (%)
C 3034
16.9%
D 2819
15.7%
A 1901
10.6%
B 1728
9.7%
H 1016
 
5.7%
L 883
 
4.9%
F 796
 
4.4%
S 790
 
4.4%
I 642
 
3.6%
T 620
 
3.5%
Other values (16) 3671
20.5%
Decimal Number
ValueCountFrequency (%)
2 2167
15.7%
0 2103
15.2%
1 2017
14.6%
3 1706
12.4%
4 1704
12.3%
7 1494
10.8%
6 875
6.3%
5 713
 
5.2%
8 664
 
4.8%
9 365
 
2.6%
Other Punctuation
ValueCountFrequency (%)
/ 185
70.1%
. 76
28.8%
, 2
 
0.8%
& 1
 
0.4%
Space Separator
ValueCountFrequency (%)
8649
> 99.9%
  1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 5180
100.0%
Open Punctuation
ValueCountFrequency (%)
( 190
100.0%
Close Punctuation
ValueCountFrequency (%)
) 189
100.0%
Math Symbol
ValueCountFrequency (%)
+ 3
100.0%
Control
ValueCountFrequency (%)
2
100.0%
Initial Punctuation
ValueCountFrequency (%)
‘ 1
100.0%
Final Punctuation
ValueCountFrequency (%)
’ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 64327
69.5%
Common 28288
30.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 4842
 
7.5%
o 4638
 
7.2%
a 4636
 
7.2%
n 3856
 
6.0%
l 3696
 
5.7%
i 3486
 
5.4%
r 3306
 
5.1%
C 3034
 
4.7%
s 2917
 
4.5%
D 2819
 
4.4%
Other values (44) 27097
42.1%
Common
ValueCountFrequency (%)
8649
30.6%
- 5180
18.3%
2 2167
 
7.7%
0 2103
 
7.4%
1 2017
 
7.1%
3 1706
 
6.0%
4 1704
 
6.0%
7 1494
 
5.3%
6 875
 
3.1%
5 713
 
2.5%
Other values (13) 1680
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 92596
> 99.9%
None 17
 
< 0.1%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8649
 
9.3%
- 5180
 
5.6%
e 4842
 
5.2%
o 4638
 
5.0%
a 4636
 
5.0%
n 3856
 
4.2%
l 3696
 
4.0%
i 3486
 
3.8%
r 3306
 
3.6%
C 3034
 
3.3%
Other values (62) 47273
51.1%
None
ValueCountFrequency (%)
é 12
70.6%
è 4
 
23.5%
  1
 
5.9%
Punctuation
ValueCountFrequency (%)
‘ 1
50.0%
’ 1
50.0%

Registration
Categorical

HIGH CARDINALITY  MISSING  UNIFORM 

Distinct: 4700
Distinct (%): 99.2%
Missing: 272
Missing (%): 5.4%
Memory size: 341.3 KiB

Value | Count
49 | 3
SU-AFK | 2
2 | 2
19 | 2
CCCP-45012 | 2
Other values (4695) | 4725

Length

Max length: 15
Median length: 6
Mean length: 6.4940878
Min length: 1

Characters and Unicode

Total characters: 30756
Distinct characters: 49
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4665
Unique (%): 98.5%

Sample

1st row: SC1
2nd row: L-48
3rd row: 97
4th row: 61
5th row: 82

Common Values

Value | Count | Frequency (%)
49 | 3 | 0.1%
SU-AFK | 2 | < 0.1%
2 | 2 | < 0.1%
19 | 2 | < 0.1%
CCCP-45012 | 2 | < 0.1%
101 | 2 | < 0.1%
G-ADUZ | 2 | < 0.1%
VH-ABB | 2 | < 0.1%
OK-MCT | 2 | < 0.1%
I-BAUS | 2 | < 0.1%
Other values (4690) | 4715 | 94.1%
(Missing) | 272 | 5.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
39
 
0.8%
hk 4
 
0.1%
49 3
 
0.1%
cccp 2
 
< 0.1%
82 2
 
< 0.1%
53 2
 
< 0.1%
cf-tcl 2
 
< 0.1%
12406 2
 
< 0.1%
f-bbdm 2
 
< 0.1%
204 2
 
< 0.1%
Other values (4732) 4772
98.8%

Most occurring characters

ValueCountFrequency (%)
- 3497
 
11.4%
C 2022
 
6.6%
A 1711
 
5.6%
1 1541
 
5.0%
N 1432
 
4.7%
2 1246
 
4.1%
P 1193
 
3.9%
4 1187
 
3.9%
5 1132
 
3.7%
0 1098
 
3.6%
Other values (39) 14697
47.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 15946
51.8%
Decimal Number 11081
36.0%
Dash Punctuation 3497
 
11.4%
Other Punctuation 119
 
0.4%
Space Separator 90
 
0.3%
Control 12
 
< 0.1%
Lowercase Letter 10
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 2022
 
12.7%
A 1711
 
10.7%
N 1432
 
9.0%
P 1193
 
7.5%
B 718
 
4.5%
F 690
 
4.3%
H 636
 
4.0%
T 611
 
3.8%
E 560
 
3.5%
G 559
 
3.5%
Other values (16) 5814
36.5%
Decimal Number
ValueCountFrequency (%)
1 1541
13.9%
2 1246
11.2%
4 1187
10.7%
5 1132
10.2%
0 1098
9.9%
3 1037
9.4%
6 1026
9.3%
7 1015
9.2%
8 912
8.2%
9 887
8.0%
Lowercase Letter
ValueCountFrequency (%)
l 5
50.0%
y 1
 
10.0%
e 1
 
10.0%
o 1
 
10.0%
w 1
 
10.0%
d 1
 
10.0%
Other Punctuation
ValueCountFrequency (%)
/ 114
95.8%
? 5
 
4.2%
Control
ValueCountFrequency (%)
10
83.3%
2
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 3497
100.0%
Space Separator
ValueCountFrequency (%)
90
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15956
51.9%
Common 14800
48.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 2022
 
12.7%
A 1711
 
10.7%
N 1432
 
9.0%
P 1193
 
7.5%
B 718
 
4.5%
F 690
 
4.3%
H 636
 
4.0%
T 611
 
3.8%
E 560
 
3.5%
G 559
 
3.5%
Other values (22) 5824
36.5%
Common
ValueCountFrequency (%)
- 3497
23.6%
1 1541
10.4%
2 1246
 
8.4%
4 1187
 
8.0%
5 1132
 
7.6%
0 1098
 
7.4%
3 1037
 
7.0%
6 1026
 
6.9%
7 1015
 
6.9%
8 912
 
6.2%
Other values (7) 1109
 
7.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30756
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 3497
 
11.4%
C 2022
 
6.6%
A 1711
 
5.6%
1 1541
 
5.0%
N 1432
 
4.7%
2 1246
 
4.1%
P 1193
 
3.9%
4 1187
 
3.9%
5 1132
 
3.7%
0 1098
 
3.6%
Other values (39) 14697
47.8%

cn_ln
Categorical

HIGH CARDINALITY  MISSING  UNIFORM 

Distinct: 3907
Distinct (%): 90.0%
Missing: 667
Missing (%): 13.3%
Memory size: 325.0 KiB

Value | Count
1 | 8
4 | 8
125 | 7
229 | 6
178 | 5
Other values (3902) | 4307

Length

Max length: 22
Median length: 19
Mean length: 5.5268371
Min length: 1

Characters and Unicode

Total characters: 23992
Distinct characters: 44
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3608
Unique (%): 83.1%

Sample

1st row: 1
2nd row: 77
3rd row: 31
4th row: 20
5th row: 178

Common Values

Value | Count | Frequency (%)
1 | 8 | 0.2%
4 | 8 | 0.2%
125 | 7 | 0.1%
229 | 6 | 0.1%
178 | 5 | 0.1%
3 | 5 | 0.1%
160 | 5 | 0.1%
303 | 5 | 0.1%
213 | 5 | 0.1%
2 | 5 | 0.1%
Other values (3897) | 4282 | 85.5%
(Missing) | 667 | 13.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
57
 
1.3%
1 10
 
0.2%
4 9
 
0.2%
3 7
 
0.2%
30 7
 
0.2%
125 7
 
0.2%
229 6
 
0.1%
160 5
 
0.1%
303 5
 
0.1%
213 5
 
0.1%
Other values (3928) 4359
97.4%

Most occurring characters

ValueCountFrequency (%)
1 3485
14.5%
0 3141
13.1%
2 2641
11.0%
4 2366
9.9%
3 2343
9.8%
5 1861
7.8%
6 1593
6.6%
9 1582
6.6%
7 1577
6.6%
8 1537
6.4%
Other values (34) 1866
7.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 22126
92.2%
Other Punctuation 713
 
3.0%
Uppercase Letter 582
 
2.4%
Dash Punctuation 430
 
1.8%
Space Separator 136
 
0.6%
Control 3
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 125
21.5%
B 65
11.2%
C 62
10.7%
S 55
9.5%
T 45
 
7.7%
H 32
 
5.5%
U 26
 
4.5%
N 20
 
3.4%
G 20
 
3.4%
D 17
 
2.9%
Other values (14) 115
19.8%
Decimal Number
ValueCountFrequency (%)
1 3485
15.8%
0 3141
14.2%
2 2641
11.9%
4 2366
10.7%
3 2343
10.6%
5 1861
8.4%
6 1593
7.2%
9 1582
7.1%
7 1577
7.1%
8 1537
6.9%
Other Punctuation
ValueCountFrequency (%)
/ 699
98.0%
? 12
 
1.7%
. 1
 
0.1%
: 1
 
0.1%
Control
ValueCountFrequency (%)
2
66.7%
1
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 430
100.0%
Space Separator
ValueCountFrequency (%)
136
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 23410
97.6%
Latin 582
 
2.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 125
21.5%
B 65
11.2%
C 62
10.7%
S 55
9.5%
T 45
 
7.7%
H 32
 
5.5%
U 26
 
4.5%
N 20
 
3.4%
G 20
 
3.4%
D 17
 
2.9%
Other values (14) 115
19.8%
Common
ValueCountFrequency (%)
1 3485
14.9%
0 3141
13.4%
2 2641
11.3%
4 2366
10.1%
3 2343
10.0%
5 1861
7.9%
6 1593
6.8%
9 1582
6.8%
7 1577
6.7%
8 1537
6.6%
Other values (10) 1284
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23992
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3485
14.5%
0 3141
13.1%
2 2641
11.0%
4 2366
9.9%
3 2343
9.8%
5 1861
7.8%
6 1593
6.6%
9 1582
6.6%
7 1577
6.6%
8 1537
6.4%
Other values (34) 1866
7.8%

All Aboard
Real number (ℝ)

Distinct: 244
Distinct (%): 4.9%
Missing: 17
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.121218
Minimum: 0
Maximum: 644
Zeros: 5
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 7
Median: 16
Q3: 35
95-th percentile: 117.5
Maximum: 644
Range: 644
Interquartile range (IQR): 28

Descriptive statistics

Standard deviation: 45.479965
Coefficient of variation (CV): 1.4613812
Kurtosis: 23.9531
Mean: 31.121218
Median Absolute Deviation (MAD): 11
Skewness: 3.9209027
Sum: 155326
Variance: 2068.4272
Monotonicity: Not monotonic
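
Several of the All Aboard statistics are simple functions of the others and can be cross-checked by hand. The sketch below assumes the usual definitions (mean = sum / non-missing count, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum); it is a consistency check on the reported values, not the profiler's implementation, and the variable names are illustrative.

```python
# Values copied from the All Aboard statistics in this report.
n_observations = 5008
n_missing = 17
total = 155326           # "Sum"
std = 45.479965          # "Standard deviation"
q1, q3 = 7, 35
minimum, maximum = 0, 644

n_valid = n_observations - n_missing   # rows with a non-missing value
mean = total / n_valid
cv = std / mean                        # coefficient of variation
variance = std ** 2
iqr = q3 - q1
value_range = maximum - minimum

print(f"mean={mean:.6f} cv={cv:.7f} variance={variance:.4f}")
print(f"iqr={iqr} range={value_range}")
```

Each recomputed value agrees with the report to the precision shown there.
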
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
3 280
 
5.6%
2 246
 
4.9%
4 202
 
4.0%
5 190
 
3.8%
10 179
 
3.6%
6 174
 
3.5%
7 164
 
3.3%
1 139
 
2.8%
9 130
 
2.6%
11 128
 
2.6%
Other values (234) 3159
63.1%
ValueCountFrequency (%)
0 5
 
0.1%
1 139
2.8%
2 246
4.9%
3 280
5.6%
4 202
4.0%
5 190
3.8%
6 174
3.5%
7 164
3.3%
8 119
2.4%
9 130
2.6%
ValueCountFrequency (%)
644 1
< 0.1%
524 1
< 0.1%
517 1
< 0.1%
394 1
< 0.1%
393 1
< 0.1%
384 1
< 0.1%
356 1
< 0.1%
349 1
< 0.1%
346 1
< 0.1%
340 1
< 0.1%

Passengers Board
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 234
Distinct (%): 4.9%
Missing: 221
Missing (%): 4.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.877376
Minimum: 0
Maximum: 614
Zeros: 869
Zeros (%): 17.4%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3
Median: 12
Q3: 30
95-th percentile: 111.7
Maximum: 614
Range: 614
Interquartile range (IQR): 27

Descriptive statistics

Standard deviation: 44.035342
Coefficient of variation (CV): 1.6383795
Kurtosis: 24.184745
Mean: 26.877376
Median Absolute Deviation (MAD): 11
Skewness: 3.9364199
Sum: 128662
Variance: 1939.1114
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 869
 
17.4%
4 170
 
3.4%
2 162
 
3.2%
5 140
 
2.8%
7 130
 
2.6%
3 130
 
2.6%
10 128
 
2.6%
9 128
 
2.6%
8 126
 
2.5%
1 120
 
2.4%
Other values (224) 2684
53.6%
(Missing) 221
 
4.4%
ValueCountFrequency (%)
0 869
17.4%
1 120
 
2.4%
2 162
 
3.2%
3 130
 
2.6%
4 170
 
3.4%
5 140
 
2.8%
6 109
 
2.2%
7 130
 
2.6%
8 126
 
2.5%
9 128
 
2.6%
ValueCountFrequency (%)
614 1
< 0.1%
509 1
< 0.1%
503 1
< 0.1%
381 1
< 0.1%
374 1
< 0.1%
364 1
< 0.1%
338 1
< 0.1%
335 1
< 0.1%
327 1
< 0.1%
316 1
< 0.1%

Crew Board
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 34
Distinct (%): 0.7%
Missing: 219
Missing (%): 4.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.5195239
Minimum: 0
Maximum: 83
Zeros: 7
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 11
Maximum: 83
Range: 83
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.7580719
Coefficient of variation (CV): 0.83151942
Kurtosis: 62.869559
Mean: 4.5195239
Median Absolute Deviation (MAD): 2
Skewness: 4.9609299
Sum: 21644
Variance: 14.123105
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=34)
Most common values
Value               Count   Frequency (%)
3                   954     19.0%
2                   828     16.5%
4                   694     13.9%
1                   535     10.7%
5                   514     10.3%
6                   375     7.5%
7                   244     4.9%
8                   173     3.5%
9                   115     2.3%
10                  94      1.9%
Other values (24)   263     5.3%
(Missing)           219     4.4%

Smallest values
Value               Count   Frequency (%)
0                   7       0.1%
1                   535     10.7%
2                   828     16.5%
3                   954     19.0%
4                   694     13.9%
5                   514     10.3%
6                   375     7.5%
7                   244     4.9%
8                   173     3.5%
9                   115     2.3%

Largest values
Value               Count   Frequency (%)
83                  1       < 0.1%
61                  1       < 0.1%
49                  1       < 0.1%
43                  1       < 0.1%
41                  1       < 0.1%
33                  1       < 0.1%
31                  1       < 0.1%
30                  1       < 0.1%
27                  1       < 0.1%
25                  4       0.1%

All Fatalities
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 199
Distinct (%): 4.0%
Missing: 8
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.294
Minimum: 0
Maximum: 583
Zeros: 76
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 4
Median: 11
Q3: 25
95-th percentile: 85
Maximum: 583
Range: 583
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 35.000385
Coefficient of variation (CV): 1.5699464
Kurtosis: 36.856285
Mean: 22.294
Median Absolute Deviation (MAD): 9
Skewness: 4.6222207
Sum: 111470
Variance: 1225.027
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most common values
Value               Count   Frequency (%)
1                   384     7.7%
2                   377     7.5%
3                   363     7.2%
4                   242     4.8%
5                   235     4.7%
6                   176     3.5%
7                   160     3.2%
10                  159     3.2%
13                  132     2.6%
9                   128     2.6%
Other values (189)  2644    52.8%

Smallest values
Value               Count   Frequency (%)
0                   76      1.5%
1                   384     7.7%
2                   377     7.5%
3                   363     7.2%
4                   242     4.8%
5                   235     4.7%
6                   176     3.5%
7                   160     3.2%
8                   128     2.6%
9                   128     2.6%

Largest values
Value               Count   Frequency (%)
583                 1       < 0.1%
520                 1       < 0.1%
349                 1       < 0.1%
346                 1       < 0.1%
329                 1       < 0.1%
301                 1       < 0.1%
298                 1       < 0.1%
290                 1       < 0.1%
275                 1       < 0.1%
271                 1       < 0.1%

Passenger Fatalities
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 190
Distinct (%): 4.0%
Missing: 235
Missing (%): 4.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.940708
Minimum: 0
Maximum: 560
Zeros: 1040
Zeros (%): 20.8%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 7
Q3: 21
95-th percentile: 81
Maximum: 560
Range: 560
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 34.06519
Coefficient of variation (CV): 1.7985172
Kurtosis: 36.950994
Mean: 18.940708
Median Absolute Deviation (MAD): 7
Skewness: 4.6462228
Sum: 90404
Variance: 1160.4372
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most common values
Value               Count   Frequency (%)
0                   1040    20.8%
1                   308     6.2%
2                   263     5.3%
3                   193     3.9%
4                   185     3.7%
5                   139     2.8%
6                   133     2.7%
7                   126     2.5%
8                   126     2.5%
9                   118     2.4%
Other values (180)  2142    42.8%
(Missing)           235     4.7%

Smallest values
Value               Count   Frequency (%)
0                   1040    20.8%
1                   308     6.2%
2                   263     5.3%
3                   193     3.9%
4                   185     3.7%
5                   139     2.8%
6                   133     2.7%
7                   126     2.5%
8                   126     2.5%
9                   118     2.4%

Largest values
Value               Count   Frequency (%)
560                 1       < 0.1%
505                 1       < 0.1%
335                 1       < 0.1%
316                 1       < 0.1%
307                 1       < 0.1%
287                 1       < 0.1%
283                 1       < 0.1%
278                 1       < 0.1%
258                 1       < 0.1%
257                 1       < 0.1%

Crew fatalities
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 28
Distinct (%): 0.6%
Missing: 235
Missing (%): 4.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.5872617
Minimum: 0
Maximum: 43
Zeros: 400
Zeros (%): 8.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 3
Q3: 5
95-th percentile: 9
Maximum: 43
Range: 43
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.1773146
Coefficient of variation (CV): 0.88572145
Kurtosis: 12.865271
Mean: 3.5872617
Median Absolute Deviation (MAD): 2
Skewness: 2.4985213
Sum: 17122
Variance: 10.095328
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=28)
Most common values
Value               Count   Frequency (%)
2                   892     17.8%
3                   824     16.5%
1                   771     15.4%
4                   591     11.8%
5                   402     8.0%
0                   400     8.0%
6                   273     5.5%
7                   171     3.4%
8                   130     2.6%
9                   87      1.7%
Other values (18)   232     4.6%
(Missing)           235     4.7%

Smallest values
Value               Count   Frequency (%)
0                   400     8.0%
1                   771     15.4%
2                   892     17.8%
3                   824     16.5%
4                   591     11.8%
5                   402     8.0%
6                   273     5.5%
7                   171     3.4%
8                   130     2.6%
9                   87      1.7%

Largest values
Value               Count   Frequency (%)
43                  1       < 0.1%
33                  1       < 0.1%
27                  1       < 0.1%
25                  2       < 0.1%
23                  6       0.1%
22                  5       0.1%
21                  2       < 0.1%
20                  3       0.1%
19                  5       0.1%
18                  3       0.1%

Ground
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 51
Distinct (%): 1.0%
Missing: 44
Missing (%): 0.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.7183723
Minimum: 0
Maximum: 2750
Zeros: 4716
Zeros (%): 94.2%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 2750
Range: 2750
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 55.495544
Coefficient of variation (CV): 32.295414
Kurtosis: 2423.809
Mean: 1.7183723
Median Absolute Deviation (MAD): 0
Skewness: 48.986786
Sum: 8530
Variance: 3079.7554
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most common values
Value               Count   Frequency (%)
0                   4716    94.2%
1                   63      1.3%
2                   34      0.7%
3                   21      0.4%
4                   16      0.3%
5                   12      0.2%
7                   10      0.2%
8                   9       0.2%
10                  6       0.1%
6                   6       0.1%
Other values (41)   71      1.4%
(Missing)           44      0.9%

Smallest values
Value               Count   Frequency (%)
0                   4716    94.2%
1                   63      1.3%
2                   34      0.7%
3                   21      0.4%
4                   16      0.3%
5                   12      0.2%
6                   6       0.1%
7                   10      0.2%
8                   9       0.2%
9                   1       < 0.1%

Largest values
Value               Count   Frequency (%)
2750                2       < 0.1%
225                 1       < 0.1%
125                 2       < 0.1%
113                 1       < 0.1%
87                  1       < 0.1%
85                  1       < 0.1%
78                  1       < 0.1%
71                  1       < 0.1%
63                  1       < 0.1%
58                  1       < 0.1%

Summary
Categorical

HIGH CARDINALITY  MISSING  UNIFORM 

Distinct: 4857
Distinct (%): 98.1%
Missing: 59
Missing (%): 1.2%
Memory size: 1.4 MiB

Value                                 Count
Crashed under unknown circumstances.  9
Crashed while en route.               8
Crashed while attempting to land.     7
Crashed during takeoff.               6
Crashed into the sea.                 5
Other values (4852)                   4914

Length

Max length: 2669
Median length: 787
Mean length: 223.39382
Min length: 8

Characters and Unicode

Total characters: 1105576
Distinct characters: 101
Distinct categories: 14
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
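
As a rough illustration of how such character properties can be read programmatically, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. It is not the profiling tool's own code: the function name `category_counts` and the sample string are illustrative. The two-letter codes returned by `unicodedata.category` correspond to the category names used in this report (`Ll` is Lowercase Letter, `Lu` is Uppercase Letter, `Zs` is Space Separator, `Po` is Other Punctuation).

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the common Summary values listed in this report.
sample = "Crashed into the sea."
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Applied to every value of a text column and summed, the same counter yields a per-category breakdown of the kind shown in the tables for this variable.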

Unique

Unique: 4813
Unique (%): 97.3%

Sample

1st row: During a demonstration flight, a U.S. Army flyer flown by Orville Wright nose-dived into the ground from a height of approximately 75 feet, killing Lt. Thomas E. Selfridge, 26, who was a passenger. This was the first recorded airplane fatality in history. One of two propellers separated in flight, tearing loose the wires bracing the rudder and causing the loss of control of the aircraft. Orville Wright suffered broken ribs, pelvis and a leg. Selfridge suffered a crushed skull and died a short time later.
2nd row: Eugene Lefebvre was the first pilot to ever be killed in an air accident, after his controls jambed while flying in an air show.
3rd row: First U.S. dirigible Akron exploded just offshore at an altitude of 1,000 ft. during a test flight.
4th row: The first fatal airplane accident in Canada occurred when American barnstormer, John M. Bryant, California aviator was killed.
5th row: The airship flew into a thunderstorm and encountered a severe downdraft crashing 20 miles north of Helgoland Island into the sea. The ship broke in two and the control car immediately sank drowning its occupants.

Common Values

Value                                 Count   Frequency (%)
Crashed under unknown circumstances.  9       0.2%
Crashed while en route.               8       0.2%
Crashed while attempting to land.     7       0.1%
Crashed during takeoff.               6       0.1%
Crashed into the sea.                 5       0.1%
Crashed shortly after taking off.     5       0.1%
Crashed on takeoff.                   4       0.1%
Shot down by rebel forces.            4       0.1%
Crashed under unknown circumstances   4       0.1%
Crashed en route.                     4       0.1%
Other values (4847)                   4893    97.7%
(Missing)                             59      1.2%

Length

Histogram of lengths of the category

Most frequent words
Value                 Count    Frequency (%)
the                   18463    10.1%
of                    5544     3.0%
a                     5456     3.0%
and                   5444     3.0%
to                    5429     3.0%
in                    3682     2.0%
crashed               3386     1.8%
was                   2779     1.5%
aircraft              2557     1.4%
into                  2360     1.3%
Other values (11568)  127976   69.9%
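
A word-frequency table like the one above can be approximated with a few lines of standard-library Python. The report does not document how the profiling tool tokenises text, so the rules below (lower-casing, splitting on whitespace, stripping surrounding punctuation) are assumptions chosen for illustration, and `word_counts` is a hypothetical helper name.

```python
import string
from collections import Counter


def word_counts(texts):
    """Count lower-cased words across an iterable of strings.

    Assumed tokenisation: split on whitespace, then strip leading and
    trailing ASCII punctuation from each token.
    """
    counts = Counter()
    for text in texts:
        for token in text.lower().split():
            word = token.strip(string.punctuation)
            if word:  # skip tokens that were punctuation only
                counts[word] += 1
    return counts


# Three of the common Summary values listed in this report.
summaries = [
    "Crashed under unknown circumstances.",
    "Crashed while en route.",
    "Crashed into the sea.",
]
counts = word_counts(summaries)
print(counts.most_common(3))
```

Different tokenisation rules (for example, keeping punctuation attached) would change the counts, so exact agreement with the table should not be expected.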

Most occurring characters

Value               Count    Frequency (%)
(space)             179362   16.2%
e                   104905   9.5%
t                   81905    7.4%
a                   79924    7.2%
n                   68116    6.2%
i                   65870    6.0%
r                   63437    5.7%
o                   62600    5.7%
h                   42794    3.9%
s                   39810    3.6%
Other values (91)   316853   28.7%

Most occurring categories

Value               Count    Frequency (%)
Lowercase Letter    869373   78.6%
Space Separator     179369   16.2%
Uppercase Letter    25294    2.3%
Other Punctuation   20624    1.9%
Decimal Number      8853     0.8%
Dash Punctuation    1645     0.1%
Close Punctuation   158      < 0.1%
Open Punctuation    140      < 0.1%
Final Punctuation   67       < 0.1%
Control             33       < 0.1%
Other values (4)    20       < 0.1%

Most frequent character per category

Lowercase Letter
Value               Count    Frequency (%)
e                   104905   12.1%
t                   81905    9.4%
a                   79924    9.2%
n                   68116    7.8%
i                   65870    7.6%
r                   63437    7.3%
o                   62600    7.2%
h                   42794    4.9%
s                   39810    4.6%
d                   38411    4.4%
Other values (30)   221601   25.5%

Uppercase Letter
Value               Count    Frequency (%)
T                   5796     22.9%
C                   2775     11.0%
A                   2579     10.2%
S                   1531     6.1%
F                   1286     5.1%
M                   1207     4.8%
I                   1063     4.2%
P                   960      3.8%
W                   924      3.7%
N                   861      3.4%
Other values (16)   6312     25.0%

Other Punctuation
Value               Count    Frequency (%)
.                   13487    65.4%
,                   5721     27.7%
'                   771      3.7%
"                   362      1.8%
/                   170      0.8%
:                   56       0.3%
;                   34       0.2%
&                   17       0.1%
%                   3        < 0.1%
#                   2        < 0.1%

Decimal Number
Value               Count    Frequency (%)
0                   2668     30.1%
1                   1368     15.5%
2                   1042     11.8%
5                   830      9.4%
3                   820      9.3%
4                   578      6.5%
6                   432      4.9%
7                   416      4.7%
8                   386      4.4%
9                   313      3.5%

Space Separator
Value               Count    Frequency (%)
(space)             179362   > 99.9%
(non-ASCII space)   7        < 0.1%

Close Punctuation
Value               Count    Frequency (%)
)                   157      99.4%
]                   1        0.6%

Open Punctuation
Value               Count    Frequency (%)
(                   139      99.3%
[                   1        0.7%

Control
Value               Count    Frequency (%)
(control character) 32       97.0%
(control character) 1        3.0%

Dash Punctuation
Value               Count    Frequency (%)
-                   1645     100.0%

Final Punctuation
Value               Count    Frequency (%)
’                   67       100.0%

Math Symbol
Value               Count    Frequency (%)
+                   7        100.0%

Currency Symbol
Value               Count    Frequency (%)
$                   7        100.0%

Other Symbol
Value               Count    Frequency (%)
°                   3        100.0%

Initial Punctuation
Value               Count    Frequency (%)
‘                   3        100.0%

Most occurring scripts

Value               Count     Frequency (%)
Latin               894667    80.9%
Common              210909    19.1%

Most frequent character per script

Latin
Value               Count     Frequency (%)
e                   104905    11.7%
t                   81905     9.2%
a                   79924     8.9%
n                   68116     7.6%
i                   65870     7.4%
r                   63437     7.1%
o                   62600     7.0%
h                   42794     4.8%
s                   39810     4.4%
d                   38411     4.3%
Other values (56)   246895    27.6%

Common
Value               Count     Frequency (%)
(space)             179362    85.0%
.                   13487     6.4%
,                   5721      2.7%
0                   2668      1.3%
-                   1645      0.8%
1                   1368      0.6%
2                   1042      0.5%
5                   830       0.4%
3                   820       0.4%
'                   771       0.4%
Other values (25)   3195      1.5%

Most occurring blocks

Value               Count     Frequency (%)
ASCII               1105434   > 99.9%
None                72        < 0.1%
Punctuation         70        < 0.1%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
(space)             179362    16.2%
e                   104905    9.5%
t                   81905     7.4%
a                   79924     7.2%
n                   68116     6.2%
i                   65870     6.0%
r                   63437     5.7%
o                   62600     5.7%
h                   42794     3.9%
s                   39810     3.6%
Other values (73)   316711    28.7%

Punctuation
Value               Count     Frequency (%)
’                   67        95.7%
‘                   3         4.3%

None
Value               Count     Frequency (%)
é                   20        27.8%
á                   15        20.8%
í                   8         11.1%
(non-ASCII space)   7         9.7%
ó                   3         4.2%
°                   3         4.2%
ö                   3         4.2%
ã                   2         2.8%
â                   2         2.8%
ü                   2         2.8%
Other values (6)    7         9.7%

Country
Categorical

Distinct: 486
Distinct (%): 9.7%
Missing: 5
Missing (%): 0.1%
Memory size: 372.3 KiB

Value                     Count
United States of America  1009
Russia                    253
Brazil                    174
Colombia                  151
Canada                    127
Other values (481)        3289

Length

Max length: 41
Median length: 37
Mean length: 11.139316
Min length: 2

Characters and Unicode

Total characters: 55730
Distinct characters: 71
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 272
Unique (%): 5.4%

Sample

1st row: United States of America
2nd row: France
3rd row: United States of America
4th row: Canada
5th row: Over the North Sea

Common Values

Value                     Count   Frequency (%)
United States of America  1009    20.1%
Russia                    253     5.1%
Brazil                    174     3.5%
Colombia                  151     3.0%
Canada                    127     2.5%
France                    123     2.5%
India                     103     2.1%
England                   101     2.0%
Indonesia                 99      2.0%
China                     94      1.9%
Other values (476)        2769    55.3%

Length

Histogram of lengths of the category

Most frequent words
Value               Count   Frequency (%)
of                  1036    11.7%
united              1027    11.6%
states              1010    11.4%
america             1009    11.4%
russia              255     2.9%
brazil              174     2.0%
colombia            151     1.7%
canada              131     1.5%
france              125     1.4%
india               106     1.2%
Other values (459)  3806    43.1%

Most occurring characters

Value               Count   Frequency (%)
a                   6678    12.0%
e                   5060    9.1%
i                   4775    8.6%
t                   4023    7.2%
(space)             3832    6.9%
n                   3683    6.6%
o                   2601    4.7%
r                   2552    4.6%
s                   2267    4.1%
d                   1911    3.4%
Other values (61)   18348   32.9%

Most occurring categories

Value               Count   Frequency (%)
Lowercase Letter    43843   78.7%
Uppercase Letter    7977    14.3%
Space Separator     3832    6.9%
Other Punctuation   39      0.1%
Decimal Number      13      < 0.1%
Close Punctuation   9       < 0.1%
Dash Punctuation    8       < 0.1%
Open Punctuation    7       < 0.1%
Control             2       < 0.1%

Most frequent character per category

Lowercase Letter
Value               Count   Frequency (%)
a                   6678    15.2%
e                   5060    11.5%
i                   4775    10.9%
t                   4023    9.2%
n                   3683    8.4%
o                   2601    5.9%
r                   2552    5.8%
s                   2267    5.2%
d                   1911    4.4%
c                   1695    3.9%
Other values (20)   8598    19.6%

Uppercase Letter
Value               Count   Frequency (%)
S                   1509    18.9%
A                   1368    17.1%
U                   1140    14.3%
C                   588     7.4%
I                   443     5.6%
R                   425     5.3%
B                   309     3.9%
P                   283     3.5%
N                   268     3.4%
G                   237     3.0%
Other values (15)   1407    17.6%

Decimal Number
Value               Count   Frequency (%)
1                   3       23.1%
0                   3       23.1%
5                   2       15.4%
2                   2       15.4%
7                   1       7.7%
8                   1       7.7%
3                   1       7.7%

Other Punctuation
Value               Count   Frequency (%)
,                   19      48.7%
.                   18      46.2%
&                   1       2.6%
:                   1       2.6%

Space Separator
Value               Count   Frequency (%)
(space)             3832    100.0%

Close Punctuation
Value               Count   Frequency (%)
)                   9       100.0%

Dash Punctuation
Value               Count   Frequency (%)
-                   8       100.0%

Open Punctuation
Value               Count   Frequency (%)
(                   7       100.0%

Control
Value               Count   Frequency (%)
(control character) 2       100.0%

Most occurring scripts

Value               Count   Frequency (%)
Latin               51820   93.0%
Common              3910    7.0%

Most frequent character per script

Latin
Value               Count   Frequency (%)
a                   6678    12.9%
e                   5060    9.8%
i                   4775    9.2%
t                   4023    7.8%
n                   3683    7.1%
o                   2601    5.0%
r                   2552    4.9%
s                   2267    4.4%
d                   1911    3.7%
c                   1695    3.3%
Other values (45)   16575   32.0%

Common
Value               Count   Frequency (%)
(space)             3832    98.0%
,                   19      0.5%
.                   18      0.5%
)                   9       0.2%
-                   8       0.2%
(                   7       0.2%
1                   3       0.1%
0                   3       0.1%
5                   2       0.1%
2                   2       0.1%
Other values (6)    7       0.2%

Most occurring blocks

Value               Count   Frequency (%)
ASCII               55724   > 99.9%
None                6       < 0.1%

Most frequent character per block

ASCII
Value               Count   Frequency (%)
a                   6678    12.0%
e                   5060    9.1%
i                   4775    8.6%
t                   4023    7.2%
(space)             3832    6.9%
n                   3683    6.6%
o                   2601    4.7%
r                   2552    4.6%
s                   2267    4.1%
d                   1911    3.4%
Other values (57)   18342   32.9%

None
Value               Count   Frequency (%)
é                   3       50.0%
è                   1       16.7%
ó                   1       16.7%
ã                   1       16.7%

Hour
Real number (ℝ)

MISSING  ZEROS 

Distinct: 24
Distinct (%): 0.7%
Missing: 1504
Missing (%): 30.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.657534
Minimum: 0
Maximum: 23
Zeros: 74
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 8
Median: 13
Q3: 17
95-th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 5.9909305
Coefficient of variation (CV): 0.47330944
Kurtosis: -0.76365172
Mean: 12.657534
Median Absolute Deviation (MAD): 4.5
Skewness: -0.20985031
Sum: 44352
Variance: 35.891248
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
Most common values
Value               Count   Frequency (%)
11                  212     4.2%
9                   207     4.1%
15                  202     4.0%
14                  201     4.0%
10                  201     4.0%
19                  194     3.9%
13                  185     3.7%
16                  184     3.7%
12                  180     3.6%
17                  180     3.6%
Other values (14)   1558    31.1%
(Missing)           1504    30.0%

Smallest values
Value               Count   Frequency (%)
0                   74      1.5%
1                   84      1.7%
2                   85      1.7%
3                   63      1.3%
4                   83      1.7%
5                   62      1.2%
6                   98      2.0%
7                   156     3.1%
8                   180     3.6%
9                   207     4.1%

Largest values
Value               Count   Frequency (%)
23                  120     2.4%
22                  122     2.4%
21                  105     2.1%
20                  163     3.3%
19                  194     3.9%
18                  163     3.3%
17                  180     3.6%
16                  184     3.7%
15                  202     4.0%
14                  201     4.0%
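
The report does not state how Hour was produced. Hour and Time share the same missing count (1504), and in the sample rows each Hour equals the hour component of Time, so a plausible reading is that Hour was extracted from Time. The sketch below illustrates that assumed derivation with the standard library; `hour_of` is a hypothetical helper, and the `HH:MM:SS` format is taken from the sample rows.

```python
from datetime import datetime
from typing import Optional


def hour_of(time_text: Optional[str]) -> Optional[int]:
    """Return the hour (0-23) of an 'HH:MM:SS' string, or None if missing."""
    if time_text is None:
        return None
    return datetime.strptime(time_text, "%H:%M:%S").hour


# Time values of the first five sample rows; None stands in for a missing time.
times = ["17:18:00", None, "06:30:00", None, "18:30:00"]
hours = [hour_of(t) for t in times]
print(hours)
```

A missing Time yields a missing Hour under this reading, which would explain why both columns report 30.0% missing values.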

Month
Categorical

Distinct: 12
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 348.6 KiB

Value               Count
December            496
January             460
September           451
August              451
November            439
Other values (7)    2711

Length

Max length: 9
Median length: 7
Mean length: 6.2789537
Min length: 3

Characters and Unicode

Total characters: 31445
Distinct characters: 26
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: September
2nd row: September
3rd row: July
4th row: August
5th row: September

Common Values

Value               Count   Frequency (%)
December            496     9.9%
January             460     9.2%
September           451     9.0%
August              451     9.0%
November            439     8.8%
October             427     8.5%
March               426     8.5%
July                425     8.5%
June                366     7.3%
February            360     7.2%
Other values (2)    707     14.1%

Length

Histogram of lengths of the category

Most frequent words
Value               Count   Frequency (%)
december            496     9.9%
january             460     9.2%
september           451     9.0%
august              451     9.0%
november            439     8.8%
october             427     8.5%
march               426     8.5%
july                425     8.5%
june                366     7.3%
february            360     7.2%
Other values (2)    707     14.1%

Most occurring characters

Value               Count   Frequency (%)
e                   4872    15.5%
r                   3767    12.0%
u                   2513    8.0%
b                   2173    6.9%
a                   2065    6.6%
y                   1604    5.1%
m                   1386    4.4%
c                   1349    4.3%
t                   1329    4.2%
J                   1251    4.0%
Other values (16)   9136    29.1%

Most occurring categories

Value               Count   Frequency (%)
Lowercase Letter    26437   84.1%
Uppercase Letter    5008    15.9%

Most frequent character per category

Lowercase Letter
Value               Count   Frequency (%)
e                   4872    18.4%
r                   3767    14.2%
u                   2513    9.5%
b                   2173    8.2%
a                   2065    7.8%
y                   1604    6.1%
m                   1386    5.2%
c                   1349    5.1%
t                   1329    5.0%
o                   866     3.3%
Other values (8)    4513    17.1%

Uppercase Letter
Value               Count   Frequency (%)
J                   1251    25.0%
A                   799     16.0%
M                   785     15.7%
D                   496     9.9%
S                   451     9.0%
N                   439     8.8%
O                   427     8.5%
F                   360     7.2%

Most occurring scripts

Value               Count   Frequency (%)
Latin               31445   100.0%

Most frequent character per script

Latin
Value               Count   Frequency (%)
e                   4872    15.5%
r                   3767    12.0%
u                   2513    8.0%
b                   2173    6.9%
a                   2065    6.6%
y                   1604    5.1%
m                   1386    4.4%
c                   1349    4.3%
t                   1329    4.2%
J                   1251    4.0%
Other values (16)   9136    29.1%

Most occurring blocks

Value               Count   Frequency (%)
ASCII               31445   100.0%

Most frequent character per block

ASCII
Value               Count   Frequency (%)
e                   4872    15.5%
r                   3767    12.0%
u                   2513    8.0%
b                   2173    6.9%
a                   2065    6.6%
y                   1604    5.1%
m                   1386    4.4%
c                   1349    4.3%
t                   1329    4.2%
J                   1251    4.0%
Other values (16)   9136    29.1%

Interactions

[Figure: 64 pairwise interaction plots; images not preserved in this text version.]

Correlations

                      All Aboard  Passengers Board  Crew Board  All Fatalities  Passenger Fatalities  Crew fatalities  Ground  Hour    Month
All Aboard            1.000       0.966             0.667       0.745           0.781                 0.369            0.039   0.027   0.000
Passengers Board      0.966       1.000             0.503       0.708           0.819                 0.232            0.021   0.031   0.000
Crew Board            0.667       0.503             1.000       0.522           0.379                 0.689            0.099   -0.006  0.004
All Fatalities        0.745       0.708             0.522       1.000           0.940                 0.681            -0.007  0.025   0.000
Passenger Fatalities  0.781       0.819             0.379       0.940           1.000                 0.457            -0.025  0.026   0.000
Crew fatalities       0.369       0.232             0.689       0.681           0.457                 1.000            0.041   -0.003  0.000
Ground                0.039       0.021             0.099       -0.007          -0.025                0.041            1.000   -0.022  0.043
Hour                  0.027       0.031             -0.006      0.025           0.026                 -0.003           -0.022  1.000   0.010
Month                 0.000       0.000             0.004       0.000           0.000                 0.000            0.043   0.010   1.000
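
The high-correlation alerts in the report overview can be cross-checked against this matrix. The sketch below lists, for each variable, the other variables whose absolute coefficient reaches a cut-off. The report does not state the cut-off it uses; 0.5 is an assumption that happens to reproduce the partner counts given in the alerts (for example, All Aboard with "Passengers Board and 3 other fields"). The helper name `high_correlations` is illustrative.

```python
names = [
    "All Aboard", "Passengers Board", "Crew Board", "All Fatalities",
    "Passenger Fatalities", "Crew fatalities", "Ground", "Hour", "Month",
]

# Coefficients copied row by row from the correlation matrix above.
matrix = [
    [1.000, 0.966, 0.667, 0.745, 0.781, 0.369, 0.039, 0.027, 0.000],
    [0.966, 1.000, 0.503, 0.708, 0.819, 0.232, 0.021, 0.031, 0.000],
    [0.667, 0.503, 1.000, 0.522, 0.379, 0.689, 0.099, -0.006, 0.004],
    [0.745, 0.708, 0.522, 1.000, 0.940, 0.681, -0.007, 0.025, 0.000],
    [0.781, 0.819, 0.379, 0.940, 1.000, 0.457, -0.025, 0.026, 0.000],
    [0.369, 0.232, 0.689, 0.681, 0.457, 1.000, 0.041, -0.003, 0.000],
    [0.039, 0.021, 0.099, -0.007, -0.025, 0.041, 1.000, -0.022, 0.043],
    [0.027, 0.031, -0.006, 0.025, 0.026, -0.003, -0.022, 1.000, 0.010],
    [0.000, 0.000, 0.004, 0.000, 0.000, 0.000, 0.043, 0.010, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report


def high_correlations(names, matrix, threshold):
    """Map each variable to the other variables with |r| >= threshold."""
    result = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, r in enumerate(matrix[i])
            if j != i and abs(r) >= threshold
        ]
        if partners:
            result[name] = partners
    return result


high = high_correlations(names, matrix, THRESHOLD)
for name, partners in high.items():
    print(f"{name}: {len(partners)} -> {', '.join(partners)}")
```

With this cut-off, Ground, Hour and Month have no partners, which is consistent with their absence from the high-correlation alerts.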

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
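
One common way to quantify nullity correlation is to turn each column into a 0/1 "value present" indicator and compute the Pearson correlation between indicators. The sketch below does this with the standard library only. It illustrates the idea and is not the code used to draw the heatmap; the helper names `present` and `pearson` are illustrative, and the toy columns are the Time, Hour and Route values of the first five sample rows.

```python
import math


def present(column):
    """Return a 0/1 indicator list: 1 where a value exists, 0 where missing."""
    return [0 if value is None else 1 for value in column]


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Note: undefined (division by zero) if either sequence is constant,
    e.g. for a column with no missing values at all.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# None stands in for a missing value (NaN / NaT in the sample rows).
time = ["17:18:00", None, "06:30:00", None, "18:30:00"]
hour = [17.0, None, 6.0, None, 18.0]
route = ["Demonstration", "Air show", "Test flight", None, None]

r_time_hour = pearson(present(time), present(hour))
r_time_route = pearson(present(time), present(route))
print(round(r_time_hour, 3), round(r_time_route, 3))
```

A value near 1 means two columns tend to be missing in the same rows, as with Time and Hour here, while a value near 0 means their gaps are largely unrelated.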

Sample

Date, Time, Location, Operator, flight_no, Route, Ac Type, Registration, cn_ln, All Aboard, Passengers Board, Crew Board, All Fatalities, Passenger Fatalities, Crew fatalities, Ground, Summary, Country, Hour, Month
01908-09-1717:18:00Fort Myer, VirginiaMilitary - U.S. ArmyNaNDemonstrationWright Flyer IIINaN12.01.01.01.01.00.00.0During a demonstration flight, a U.S. Army flyer flown by Orville Wright nose-dived into the ground from a height of approximately 75 feet, killing Lt. Thomas E. Selfridge, 26, who was a passenger. This was the first recorded airplane fatality in history. One of two propellers separated in flight, tearing loose the wires bracing the rudder and causing the loss of control of the aircraft. Orville Wright suffered broken ribs, pelvis and a leg. Selfridge suffered a crushed skull and died a short time later.United States of America17.0September
11909-09-07NaTJuvisy-sur-Orge, FranceNaNNaNAir showWright ByplaneSC1NaN1.00.01.01.00.00.00.0Eugene Lefebvre was the first pilot to ever be killed in an air accident, after his controls jambed while flying in an air show.FranceNaNSeptember
21912-07-1206:30:00Atlantic City, New JerseyMilitary - U.S. NavyNaNTest flightDirigibleNaNNaN5.00.05.05.00.05.00.0First U.S. dirigible Akron exploded just offshore at an altitude of 1,000 ft. during a test flight.United States of America6.0July
31913-08-06NaTVictoria, British Columbia, CanadaPrivateNaNNaNCurtiss seaplaneNaNNaN1.00.01.01.00.01.00.0The first fatal airplane accident in Canada occurred when American barnstormer, John M. Bryant, California aviator was killed.CanadaNaNAugust
41913-09-0918:30:00Over the North SeaMilitary - German NavyNaNNaNZeppelin L-1 (airship)NaNNaN20.0NaNNaN14.0NaNNaN0.0The airship flew into a thunderstorm and encountered a severe downdraft crashing 20 miles north of Helgoland Island into the sea. The ship broke in two and the control car immediately sank drowning its occupants.Over the North Sea18.0September
51913-10-1710:30:00Near Johannisthal, GermanyMilitary - German NavyNaNNaNZeppelin L-2 (airship)NaNNaN28.0NaNNaN28.0NaNNaN0.0Hydrogen gas which was being vented was sucked into the forward engine and ignited causing the airship to explode and burn at 3,000 ft..German Navy's Zeppelin airships L-4 and L-5 were blown out to sea in February 1915, never to be seen again.Germany10.0October
61915-03-0501:00:00Tienen, BelgiumMilitary - German NavyNaNNaNZeppelin L-8 (airship)NaNNaN41.00.041.017.00.017.00.0Crashed into trees while attempting to land after being shot down by British and French aircraft.Belgium1.0March
71915-09-0315:20:00Off Cuxhaven, GermanyMilitary - German NavyNaNNaNZeppelin L-10 (airship)NaNNaN19.0NaNNaN19.0NaNNaN0.0Exploded and burned near Neuwerk Island, when hydrogen gas, being vented, was ignited by lightning.Germany15.0September
81916-07-28NaTNear Jambol, BulgeriaMilitary - German ArmyNaNNaNSchutte-Lanz S-L-10 (airship)NaNNaN20.0NaNNaN20.0NaNNaN0.0Crashed near the Black Sea, cause unknown.BulgeriaNaNJuly
91916-09-2401:00:00Billericay, EnglandMilitary - German NavyNaNNaNZeppelin L-32 (airship)NaNNaN22.0NaNNaN22.0NaNNaN0.0Shot down by British aircraft crashing in flames.England1.0September
Date, Time, Location, Operator, flight_no, Route, Ac Type, Registration, cn_ln, All Aboard, Passengers Board, Crew Board, All Fatalities, Passenger Fatalities, Crew fatalities, Ground, Summary, Country, Hour, Month
49982020-08-0719:14:00Calicut, IndiaAir India ExppressIX344Dubai - CalicutBoeing 737-8HGVT-AXH36323/2108190.0184.06.020.018.02.00.0The flight IX344 suffered a runway excursion while landing at Kozhikode-Calicut Airport in heavy rain. The nose section separated from the fuselage after going down a steep slope at the end of the runway. The pilot and copilot were among the dead. Low visibility, wet runway, low cloud base and poor braking action possibly contributed to the accident.India19.0August
49992020-08-2208:40:00Juba, South SudanSouth West AviaitonNaNJuba - WauAntonov 26BEX-126115088.05.03.07.04.03.00.0The cargo plane lost height shortly after departure from Juba Airport and impacted a farm near Hai Referendum about 3nm southwest of the airport. One passenger survived in critical condition. The plane was chartered by the World Food Program to transport supplies and wages to Wau and Aweil.South Sudan8.0August
50002020-09-2520:50:00Near Chuguev, UkraineMilitary - Ukraine Air ForceNaNTrainingAntonov An26SH76 yellow560827.020.07.026.019.07.00.0The military transport, crashed 1.2 miles from Chuguev air base. The plane was carrying cadets from a nearby air force university on a training flight. The crew may have reported failure of an engine prior to the accident.Ukraine20.0September
50012021-01-0914:40:00Near Jakarta, IndonesiaSriwijaya AirSJ182Jakarta - PontianakBoeing 737-524PK-CLC27323/261662.056.06.062.056.06.00.0Sriwijaya Air flight 182 was climbing through 10,900 ft., 11 nm north of Jakarta-Soekarno-Hatta International Airport, over the Java Sea when radar and radio contact was lost. The aircraft then lost height rapidly and impacted the Java Sea. Debris was located near Lancang Island.Indonesia14.0January
50022021-03-0217:05:00Pieri, SudanSouth Sudan Supreme AirlinesNaNPieri - YuaiLet L-410UVP-EHK-427490252510.08.02.010.08.02.00.0One of the engines on the aircraft failed 10 minutes after takeof. When the plane turned back, the second engine failed.Sudan17.0March
50032021-03-2818:35:00Near Butte, AlaskaSoloy HelicoptersNaNSightseeing CharterEurocopter AS350B3 EcureuilN351SH45986.05.01.05.04.01.00.0The sightseeing helicopter crashed after missing the top of a 6,000 ft mountain by just 10 - 15 ft. The crash site was near Knik glacier. The pilot, and four others were killed including Czech billionaire Petr Kellner.United States of America18.0March
50042021-05-2118:00:00Near Kaduna, NigeriaMilitary - Nigerian Air ForceNaNNaNBeechcraft B300 King Air 350iNAF203FL-89111.07.04.011.07.04.00.0While on final approach, in poor weather conditions, the aircraft crashed and burst into flames less than 10 km from Kaduna Airport. All 11 occupants were killed, incuding General Ibrahim Attahiru, Chief of Staff of the Nigerian Army.Nigeria18.0May
50052021-06-1008:00:00Near Pyin Oo Lwin, MyanmarMilitary - Myanmar Air ForceNaNNaypyidaw - AnisakanBeechcraft 1900D4610E-32514.012.02.012.011.01.00.0The plane was carrying military personnel and monks when it crashed about 300 meters from a steel plant in the Mandalay region. The plane was attempting to land in poor weather conditions and broke into three pieces.Myanmar8.0June
50062021-07-0411:30:00Patikul, Sulu, PhilippinesMilitary - Philippine Air ForceNaNCagayan de Oro-Lumbia - JoloLockheed C-130H Hercules5125512596.088.08.050.0NaNNaN3.0While attempting to land at Jolo Airport, the military transport overran the runway, struck two houses and burst into flames coming to rest on a coconut plantation.Philippines11.0July
50072021-07-0615:00:00Palana, RussiaKamchatka Aviation Enterprise251Petropavlovsk - PalanaAntonov An 26B-100RA-260851231028.022.06.028.022.06.00.0The passenger plane crashed into the top of a cliff while attempting to land in inclement weather. The debris fell into the sea. Contact was lost with the plane 10 minutes before it was to land.Russia15.0July